Simplifying the mosaic description of DNA sequences.

Authors

  • Rajeev K Azad
  • J Subba Rao
  • Wentian Li
  • Ramakrishna Ramaswamy
Abstract

By using the Jensen-Shannon divergence, genomic DNA can be divided into compositionally distinct domains through a standard recursive segmentation procedure. Each domain, while significantly different from its neighbors, may, however, share compositional similarity with one or more distant (non-neighboring) domains. We thus obtain a coarse-grained description of the given DNA string in terms of a smaller set of distinct domain labels. This yields a minimal domain description of a given DNA sequence, significantly reducing its organizational complexity. This procedure gives a new means of evaluating genomic complexity as one examines organisms ranging from bacteria to human. The mosaic organization of DNA sequences could have originated from the insertion of fragments of one genome (the parasite) inside another (the host), and we present numerical experiments that are suggestive of this scenario.
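As a rough illustration of the two steps the abstract describes (recursive segmentation by Jensen-Shannon divergence, then assigning shared labels to compositionally similar domains), a minimal Python sketch follows. It is not the authors' implementation. The stopping rule (a fixed cutoff on sequence length times divergence), the parameters `threshold` and `min_len`, and the greedy labeling scheme are all assumptions made for illustration; the abstract does not specify the significance criterion actually used.

```python
import math
from collections import Counter


def entropy(counts, n):
    """Shannon entropy in bits of a count table holding n symbols."""
    return -sum((c / n) * math.log2(c / n) for c in counts.values() if c > 0)


def js_divergence(left, right):
    """Length-weighted Jensen-Shannon divergence between two count tables."""
    n_l, n_r = sum(left.values()), sum(right.values())
    n = n_l + n_r
    pooled = left + right  # Counter addition keeps only positive counts
    return (entropy(pooled, n)
            - (n_l / n) * entropy(left, n_l)
            - (n_r / n) * entropy(right, n_r))


def best_split(seq, min_len):
    """Return (position, divergence) of the split that maximizes the divergence."""
    left, right = Counter(), Counter(seq)
    best_pos, best_d = None, 0.0
    for i in range(1, len(seq)):
        base = seq[i - 1]
        left[base] += 1
        right[base] -= 1
        if i < min_len or len(seq) - i < min_len:
            continue  # keep both sides at least min_len long
        d = js_divergence(left, right)
        if d > best_d:
            best_pos, best_d = i, d
    return best_pos, best_d


def segment(seq, threshold=10.0, min_len=10, start=0):
    """Recursively split seq; returns a list of (start, end) domains.

    ASSUMPTION: a split is accepted when len(seq) * divergence >= threshold.
    This fixed cutoff is a stand-in for a proper significance test.
    """
    pos, d = best_split(seq, min_len)
    if pos is None or len(seq) * d < threshold:
        return [(start, start + len(seq))]
    return (segment(seq[:pos], threshold, min_len, start)
            + segment(seq[pos:], threshold, min_len, start + pos))


def label_domains(seq, domains, threshold=10.0):
    """Greedily give the same label to domains with similar composition.

    ASSUMPTION: a domain joins the first existing label whose pooled
    composition is not distinguishable from it under the same cutoff.
    """
    pooled_counts, labels = [], []
    for s, e in domains:
        counts = Counter(seq[s:e])
        for k, rep in enumerate(pooled_counts):
            n = sum(rep.values()) + (e - s)
            if n * js_divergence(rep, counts) < threshold:
                rep.update(counts)  # pool the domain into this label
                labels.append(k)
                break
        else:
            pooled_counts.append(counts)
            labels.append(len(pooled_counts) - 1)
    return labels


if __name__ == "__main__":
    toy = "A" * 200 + "G" * 200 + "A" * 200  # synthetic three-domain sequence
    domains = segment(toy)
    print(domains, label_domains(toy, domains))
```

On the synthetic sequence above, the first and third domains receive the same label, so three domains are described with only two distinct labels. That reduction is the sense in which the domain description becomes "minimal".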

Similar resources

Association of Tomato Leaf Curl New Delhi Virus, Betasatellite, and Alphasatellite with Mosaic Disease of Spine Gourd (Momordica dioica Roxb. Willd) in India

Background: Spine gourd (Momordica dioica Roxb. Willd) is one of the important cucurbitaceous crops grown across the world for vegetable and medicinal purposes. Diseases caused by the DNA viruses are becoming the limiting factors for the production of spine gourd reducing its potential yield. For the commercial cultivation of the spine gourd, propagation material used by most o...

Complete nucleotide sequence and host range of South African cassava mosaic virus: further evidence for recombination amongst begomoviruses.

Complete nucleotide sequences of the DNA-A (2800 nt) and DNA-B (2760 nt) components of a novel cassava-infecting begomovirus, South African cassava mosaic virus (SACMV), were determined and compared with various New World and Old World begomoviruses. SACMV is most closely related to East African cassava mosaic virus (EACMV) in both its DNA-A (85% with EACMV-MH and -MK) and -B (90% with EACMV-UG...

MOSAIC: segmenting multiple aligned DNA sequences

MOSAIC is a set of tools for the segmentation of multiple aligned DNA sequences into homogeneous zones. The segmentation is based on the distribution of mutational events along the alignment. As an example, the analysis of one repeated sequence belonging to the subtelomeric regions of the yeast genome is presented. Availability: Free access from ftp://ftp.biomath.jussieu.fr/pub/pape...

Cauliflower mosaic virus 35S promoter-controlled DNA copies of cowpea mosaic virus RNAs are infectious on plants.

Clones have been constructed that contain full-length cDNA copies of cowpea mosaic virus RNA1 and RNA2, downstream of the cauliflower mosaic virus 35S promoter. The clones, when linearized downstream of the viral sequences, give rise to cowpea mosaic virus-like symptoms when inoculated onto cowpea plants. Viral RNA and virions can be detected in the inoculated plants, demonstrating that the clo...

Quantification of DNA patchiness using long-range correlation measures.

We introduce and develop new techniques to quantify DNA patchiness, and to quantify characteristics of its mosaic structure. These techniques, which involve calculating two functions, alpha(l) and beta(l), measure correlations at length scale l and detect distinct characteristic patch sizes embedded in scale-invariant patch size distributions. Using these new methods, we address a number of iss...

Journal:
  • Physical review. E, Statistical, nonlinear, and soft matter physics

Volume 66, 3 Pt 1

Pages: -

Publication date: 2002